1. INTRODUCTION

It is well known that the extent of whitecap cover on the surface of a
sea is greatly influenced by the surface windspeed (Monahan (1971), Toba and
Chaen (1973), Wu (1979), Monahan and O'Muircheartaigh (1980)). Other
variables, such as sea surface temperature, are also important, but
windspeed appears to play the dominant role. Whitecap cover can be
remotely sensed while windspeed cannot, so it is tempting to use the
relationship between windspeed and whitecaps to infer reasonable values for
the surface windspeed. To do so requires that the natural causative
relation "whitecaps ← windspeed", quantitatively estimated from field data
as a statistical regression of (some measure of) whitecap coverage on
windspeed, be reversed. It turns out that the "natural" way of solving the
problem, namely regressing whitecap cover on windspeed and then
inverting that regression relation, actually produces results that are
inferior to those from some other procedures. Since the indirect remote
sensing of windspeed is of operational interest, and since similar problems
may well arise in other areas, within remote sensing and beyond, we present
illustrative statistical data analyses of several sets of whitecap-windspeed
data in this paper. We also include, in later sections of the
paper, similar analyses based on simulated data.

The general problem considered here is that of making inferences about
an unknown $p \times 1$ vector X' from a single random observed $q \times 1$ response vector
Y'. The relationship between Y and X is calibrated with experimental data
$(Y_i, X_i)$, i = 1, 2, ..., n, where $Y_i$, $X_i$ are $q \times 1$ and $p \times 1$ vectors, respectively.
The case p = q = 1 has been extensively discussed in the literature, and
reference will be made below to several basic contributions to calibration
methods for this case. The situation when at least one of p, q is greater
than one is the subject of a comprehensive paper by Brown (1982).

Brown (1982) distinguishes two cases of interest: (a) when both X and
Y are random and (b) when only Y is random, and X can be fixed at prechosen
levels. The former case is called random calibration, and the latter
controlled calibration. The present paper is concerned solely with the
problem of random calibration, because the data of interest arise in an
observational context, and not from a controlled experiment.

A brief outline of the paper is as follows. In Section 2 we describe
several different plausible methods of point estimation in univariate
calibration. The methods described are subsequently applied to four data
sets, and their performance evaluated, in Section 3. In Section 4 we
consider four interval estimates associated with the calibration problem,
and apply them to the data sets. The problem of multivariate calibration is
examined in Section 5. Several of the univariate methods are extended to
this situation and applied to the same four data sets, and to a further set
provided in Brown (1982). The application and an evaluation of the results
are presented in Section 6.

The later sections of the paper consider the same problems, but in the 
context of a simulation study. Section 7 gives a brief description of the 
objectives of the simulation study, Section 8 describes the point estimation 
results, and Section 9 those related to interval estimation. 

2. THE UNIVARIATE PROBLEM 

The simplest version of the calibration problem, and the one most
extensively discussed, is the case p = q = 1 in which the calibration
curve is linear in both the parameters and the independent variable. The
situation of interest may therefore be described as follows: given two
random variables X, Y with the relationship

$$ Y = \alpha + \beta X + \varepsilon \tag{2.1} $$

where, most classically, $\varepsilon \sim N(0, \sigma^2)$, and given n independent pairs of
observations $(X_i, Y_i)$ on (X, Y) and a new observation $y_0$ on Y, how do we
predict or estimate the corresponding value $X_0 = X(y_0)$? Numerous
solutions have been proposed, and their performances evaluated. Four of
these methods, which together yield five estimators, have been applied in
Section 3 to four data sets that relate whitecap cover to surface windspeed.
The four methods examined are these:

(i) the so-called classical method, viz., estimate $\alpha$, $\beta$ in equation (2.1)
by least squares, and then for $Y = y_0$ the predicted value of X is

$$ \hat{X}_C = \frac{y_0 - \hat{\alpha}}{\hat{\beta}}, \qquad \hat{\beta} \neq 0. \tag{2.2} $$

(ii) Krutchkoff (1967) suggested another estimator obtained by rewriting
(2.1) as

$$ X = \gamma + \delta Y + \varepsilon \tag{2.3} $$

and obtaining least-squares estimators $\hat{\gamma}$, $\hat{\delta}$ of $\gamma$, $\delta$; the predicted value of
X, given $Y = y_0$, will then be

$$ \hat{X}_I = \hat{\gamma} + \hat{\delta} y_0 , $$

so denoted because it is known as the inverse estimator.

Krutchkoff (1967) concluded by means of a Monte Carlo study that $\hat{X}_I$ had
uniformly smaller mean squared error (MSE) than the classical estimator $\hat{X}_C$.
In a later (1969) paper he concluded that this result was valid only within
the calibration range, whereas, in fact, the reverse result held outside
that range. Williams (1969) pointed out that for finite samples the MSE of
the classical estimator is infinite and that of the inverse estimator
finite, so that the use of the MSE for comparing these estimators is
unsatisfactory.

(iii) Lwin & Maritz (1980) proposed an alternative estimator based on
the fact that for this particular problem, the predictor of $X_0$ given by

$$ X^*(y_0) = E\{X \mid Y = y_0\} \tag{2.4} $$



has minimum mean-squared error (provided $\sigma$ and $\alpha$, $\beta$ are all known). By
using consistent estimators of $\sigma$, $\alpha$, and $\beta$ and by approximating the marginal
distribution of X with the corresponding empirical distribution function,
Lwin & Maritz showed that the estimator

$$ \hat{X}_E(y_0) = \frac{\sum_{i=1}^{n} X_i \, f\{(y_0 - \hat{\alpha} - \hat{\beta} X_i)/\hat{\sigma}\}}{\sum_{i=1}^{n} f\{(y_0 - \hat{\alpha} - \hat{\beta} X_i)/\hat{\sigma}\}} \tag{2.5} $$

will, subject to easily satisfied regularity conditions, tend to the optimal
estimator $X^*(y_0)$ in mean square, where f is the error density function
(presumed known; otherwise estimated).
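The three estimators described so far can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names, the toy data and the choice of a standard Normal density for f are assumptions made here for the example.

```python
import math

def fit_line(u, v):
    """Least-squares fit of v = a + b*u; returns (a, b, residual_sd)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    b = suv / suu
    a = mv - b * mu
    rss = sum((vi - a - b * ui) ** 2 for ui, vi in zip(u, v))
    return a, b, math.sqrt(rss / (n - 2))

def classical(x, y, y0):
    """Equation (2.2): regress Y on X, then invert the fitted line."""
    a, b, _ = fit_line(x, y)
    return (y0 - a) / b

def inverse(x, y, y0):
    """Inverse estimator: regress X on Y and predict directly."""
    g, d, _ = fit_line(y, x)
    return g + d * y0

def normal_pdf(z):
    """Standard Normal density (assumed error density f)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def empirical(x, y, y0, f=normal_pdf):
    """Equation (2.5): weighted average of the observed X_i."""
    a, b, s = fit_line(x, y)
    w = [f((y0 - a - b * xi) / s) for xi in x]
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

# Hypothetical calibration data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
y0 = 3.5
x_c = classical(x, y, y0)
x_i = inverse(x, y, y0)
x_e = empirical(x, y, y0)
```

In this sketch the inverse prediction is always at least as close to the sample mean of X as the classical one (the two deviations differ by the factor r^2), and the empirical prediction, being a weighted average of the observed X values, can never leave their range.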

(iv) A Bayesian methodology was introduced by Aitchison & Dunsmore
(1975). This method involves the assumptions that

(a) X, Y are Normal,

(b) $Y \mid x \sim N(\alpha + \beta x, \sigma^2)$.

From these assumptions, it can be shown that the predictive distribution for
$X_0$, given n pairs of observations $(X_i, Y_i)$ and a single new observation $y_0$, is
proportional to



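$$ \mathrm{St}\Big\{n-1,\ \bar{x},\ \Big(1+\frac{1}{n}\Big)\frac{\sum_i (x_i-\bar{x})^2}{n-1}\Big\} \times \mathrm{St}\Big\{n-2,\ m,\ \Big(1+\frac{1}{\kappa}\Big) v\Big\} \tag{2.6} $$

the first factor being a density in $x_0$ and the second the density of $y_0$ given $x_0$, where

$$ \frac{1}{\kappa} = \frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}, \qquad m = \bar{y} + \hat{\beta}(x_0 - \bar{x}), \qquad v = \frac{\sum_i (y_i - \hat{\alpha} - \hat{\beta} x_i)^2}{n-2} $$

and

$$ S_{xx} = \sum_i (x_i - \bar{x})^2 , $$

and St{k, b, c} is the usual shifted and scaled Student's t-distribution with density
function given by

$$ f(u; c, b, k) = \frac{1}{\mathrm{Be}(\tfrac{1}{2}, \tfrac{k}{2})\,(kc)^{1/2}} \Big[1 + \frac{(u-b)^2}{kc}\Big]^{-(k+1)/2} \tag{2.7} $$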



The constant of proportionality in (2.6) must be obtained by numerical
integration. The predictive distribution of (2.6) enables us to obtain
either point or interval estimates of $X_0$. The point estimates examined are
(a) the mean of the predictive distribution, $\hat{X}_{ME}$, and (b) the mode of the
predictive distribution, $\hat{X}_{MO}$.
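The numerical normalization mentioned above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: it assumes the predictive density is the product of a Student density for $x_0$ (n-1 d.f.) and a Student regression density for $y_0$ given $x_0$ (n-2 d.f.), evaluates that product on a grid, and normalizes by a simple Riemann sum. The function names, grid settings and data are hypothetical.

```python
import math

def student_pdf(u, k, b, c):
    """Density of St{k, b, c}: k d.f., location b, squared scale c."""
    log_norm = (math.lgamma(0.5) + math.lgamma(k / 2.0)
                - math.lgamma((k + 1) / 2.0) + 0.5 * math.log(k * c))
    return math.exp(-log_norm
                    - 0.5 * (k + 1) * math.log1p((u - b) ** 2 / (k * c)))

def predictive_mean_mode(x, y, y0, grid_size=4001, span=10.0):
    """Mean and mode of the unnormalized predictive density on a grid."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    v = sum((yi - my - b * (xi - mx)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    sx2 = sxx / (n - 1)
    lo = mx - span * math.sqrt(sx2)
    step = 2.0 * span * math.sqrt(sx2) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    dens = []
    for x0 in grid:
        # Factor 1: predictive density of x0 from the observed x's.
        px = student_pdf(x0, n - 1, mx, (1.0 + 1.0 / n) * sx2)
        # Factor 2: predictive density of y0 given x0 from the regression.
        m = my + b * (x0 - mx)
        scale = (1.0 + 1.0 / n + (x0 - mx) ** 2 / sxx) * v
        dens.append(px * student_pdf(y0, n - 2, m, scale))
    total = sum(dens)  # the grid step cancels in the ratio below
    mean = sum(g * d for g, d in zip(grid, dens)) / total
    mode = grid[max(range(grid_size), key=dens.__getitem__)]
    return mean, mode

# Hypothetical calibration data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
mean_x0, mode_x0 = predictive_mean_mode(x, y, 3.5)
```

A finer grid or a proper quadrature rule would be needed for serious work; the grid is used here only because it makes the mean and the mode available from the same set of density values.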

We have, therefore, five different estimators to be compared:

(i) the classical predictor $\hat{X}_C$

(ii) the inverse predictor $\hat{X}_I$

(iii) the empirical predictor $\hat{X}_E$

(iv) the mean of the predictive distribution $\hat{X}_{ME}$

(v) the mode of the predictive distribution $\hat{X}_{MO}$



3. COMPARISON OF UNIVARIATE PREDICTORS

3.1 The Data

The five predictors were compared by applying them to four data 
sets. The data sets consist of measurements of instantaneous oceanic 
whitecap coverage (Y) and wind speed (X), and the object of the exercise is 
the prediction of $X_0$ given a new observation $y_0$. An initial inspection of
the data suggested lognormal distributions for both X and Y and a log 
transformation gave an acceptable fit to a Normal distribution. Data points 
for which whitecap coverage was 0.0 were excluded from the analysis for 
several reasons, but particularly because it seemed reasonable to assume 
that a zero whitecap coverage gave no additional information in relation to 
wind speed over and above the conditional distribution of wind speed given 
zero whitecap coverage. The data sets involved were the following: 

Data set 1: Monahan (1971) 

Data set 2: Toba & Chaen (1973) 

Data set 3: JASIN experiment (1978), (Monahan et al. (1981))

Data set 4: STREX experiment (1981), (Monahan et al. (1981))

The numbers of (pairs of) non-zero observations in the respective data sets
were 43, 18, 37 and 78.

3.2 Method of Comparison of Estimators

For each data set, we excluded one data point at a time and
obtained each of the five estimators based on the remaining data. We then
predicted the x-value of the excluded point, given the y-value of that
point, using each of the five estimators. This provided five predicted
x-values for each point in each data set. Finally, for each of the five
estimators and for each data set, we calculated the mean bias (MB) and the
mean-squared prediction error (MSPE), defined as follows for a given data
set:

$$ MB = \sum_{i=1}^{n} (\hat{x}_i - x_i)/n \tag{3.1} $$

$$ MSPE = \sum_{i=1}^{n} (\hat{x}_i - x_i)^2/n \tag{3.2} $$

where n is the number of points in the data set.

3.3 Results

The results are presented in Tables 1 and 2.
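The leave-one-out comparison of Section 3.2 can be sketched as follows. The sketch assumes the bias is taken as predicted minus observed, uses the inverse estimator as the example predictor, and runs on made-up data; none of the names or numbers below come from the original analysis.

```python
def inverse_predictor(x, y, y0):
    """Inverse estimator: least-squares regression of X on Y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx + (sxy / syy) * (y0 - my)

def leave_one_out(x, y, predictor):
    """Return (MB, MSPE) as in (3.1) and (3.2) for one predictor."""
    errors = []
    for i in range(len(x)):
        # Fit on all points except the i-th, then predict x_i from y_i.
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        errors.append(predictor(x_rest, y_rest, y[i]) - x[i])
    n = len(errors)
    mb = sum(errors) / n
    mspe = sum(e * e for e in errors) / n
    return mb, mspe

# Hypothetical (log-scale) data, for illustration only.
x = [1.2, 1.8, 2.1, 2.5, 2.9, 3.3, 3.8, 4.1]
y = [0.9, 1.7, 2.4, 2.3, 3.1, 3.0, 4.0, 4.2]
mb, mspe = leave_one_out(x, y, inverse_predictor)
```

Any of the other estimators can be passed in place of `inverse_predictor`, provided it takes the calibration sample and a new y-value and returns a predicted x-value.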
TABLE 1

Bias of Estimators

                                   Estimator
Data Set   $\hat{X}_C$   $\hat{X}_I$   $\hat{X}_E$   $\hat{X}_{ME}$   $\hat{X}_{MO}$
   1         .0150         .0038        -.0039          .1969           -.0050
   2         .0119        -.0068        -.0141         -.1110           -.0092
   3         .0055         .0019        -.0151          .2831            .0080
   4        -.001          .0004         .0030          .0415            .0006


Table 1 shows that, in terms of bias, the estimator $\hat{X}_{ME}$ (i.e., the mean of
the predictive distribution of x) is uniformly the worst of the five
estimators and the inverse estimator almost uniformly the least biased.
The estimator is close to but slightly worse than, x^ in terms of bias. 
A two-way analysis of variance applied to the data of Table 1 yielded the 
obvious results in terms of significance. 
TABLE 2

Mean-Squared Prediction Error

                                   Estimator
Data Set   $\hat{X}_C$   $\hat{X}_I$   $\hat{X}_E$   $\hat{X}_{ME}$   $\hat{X}_{MO}$
   1          .192          .059          .056           .060            .060
   2          .205          .082          .086           .082            .083
   3          .103          .066          .060           .06?            .067
   4          .149          .061          .062           .061            .061


This table shows that in terms of average squared prediction error, the
classical predictor is uniformly the poorest, having mean-squared
prediction error roughly 1.5 to 3 times that of any of the other
estimators. The remaining four estimators are very close in terms of
predictive capacity for these data sets, with none uniformly better than the
others. Once again, a two-way ANOVA yielded the expected results.

One advantage of the Aitchison and Dunsmore method is that it
produces, in addition to the point predictions, the predictive distribution
of X given $y = y_0$. From this it is possible to obtain shortest 100α%
confidence intervals for x given $y = y_0$.

4. INTERVAL ESTIMATION FOR THE WIND/WHITECAP DATA

For each predicted value, and for each data set, the following 95%
confidence intervals have been constructed.

I1. The standard confidence interval based on the inverse
regression, i.e., regression of U on W.

I2. An interval based on the Lwin & Maritz estimator, and using the
standard deviation of the closely related estimator E8, derived in
Section 8.

I3. An interval based on the classical estimator $\hat{X}_C$, and described
in Brownlee (1965).

The results are presented below:

TABLE 3

Confidence Intervals

                   I1                     I2                     I3
Data Set   % cov   Av length      % cov   Av length      % cov   Av length
   1        97.7      .90          97.7      .89          96.3     1.72
   2        96.6     1.04          94.6     1.01          95.6     2.23
   3        89.2      .96          94.6      .95          90.3     1.84
   4        93.5      .96          92.3      .96          94.2     1.62



In general, intervals I1 and I2 are very comparable. The analysis was
performed, as in the case of point estimation, by omitting each point in
turn and constructing a confidence interval based on an analysis of all the
remaining points of the data set.



5. THE MULTIVARIATE CASE 



Brown (1982) has studied the case of p, q not both 1 in some depth.
While most of his analysis relates to the case of controlled calibration
(i.e., X not random), he does devote some attention to the random
calibration situation. The model employed is a generalization of (2.1),
viz.,

$$ Y = 1\alpha^T + X B + E \tag{5.1} $$

where Y (n x q), E (n x q) and X (n x p) are random matrices, and E is a
matrix of random errors whose rows have covariance matrix $\Sigma$. If the
variables are post hoc centered at zero, we can, without loss of generality,
rewrite equation (5.1) so that the constant term disappears and hence we have

$$ Y = X B + E \tag{5.2} $$

Brown (1982) suggests three estimators for the multivariate situation.
These are analogous to the predictors $\hat{X}_C$, $\hat{X}_I$ and $\hat{X}_E$ of Section 2 and are
derived as follows:

I. From regression of Y on X (denoted L by Brown), and analogous to $\hat{X}_C$:

$$ \hat{X}_L = (\hat{B} S^{-1} \hat{B}^T)^{-1} \hat{B} S^{-1} y_0 \tag{5.3} $$

where $y_0$ is the newly observed single value of Y which we are to use to
predict X, and $\hat{B}$, S are the usual least squares estimators of B, $\Sigma$ (Mardia
et al., 1979). Note that if we replace $\hat{B}$, S by their univariate
counterparts, and put $\hat{\alpha} = 0$ (following centering of the data), equation
(5.3) does, as expected, reduce to equation (2.2).
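The reduction noted above can be written out in one line. The following is a sketch assuming p = q = 1, so that $\hat{B}$ becomes the scalar slope $\hat{\beta}$ and S the scalar residual variance, written here as $s^2$ (a symbol introduced only for this illustration):

```latex
\hat{X}_L
  = (\hat{\beta}\, s^{-2}\, \hat{\beta})^{-1}\, \hat{\beta}\, s^{-2}\, y_0
  = \frac{s^{2}}{\hat{\beta}^{2}} \cdot \frac{\hat{\beta}}{s^{2}}\; y_0
  = \frac{y_0}{\hat{\beta}}
```

which is the classical estimator (2.2) with $\hat{\alpha} = 0$. The residual variance cancels, so in the univariate case the estimator does not depend on S at all.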

The analysis which produces $\hat{X}_L$ here performs a multivariate regression
of Y on X. Brown (1982) suggests an alternative predictor $\hat{X}_{L'}$, where in
attempting to predict a component of X (say $X_j$) we regress Y on $X_j$ alone,
and obtain $\hat{X}_{L'}$ by a formula analogous to (5.3).

II. From multivariate regression of X on Y (denoted LB in Brown (1982)),
with X and Y the centered data matrices of (5.2):

$$ \hat{X}_{LB} = X^T Y (Y^T Y)^{-1} y_0 \tag{5.4} $$

Note that in this case, each component of X is predicted ignoring all the
other components of X; in effect we carry out a multiple (not a
multivariate) regression of each component of X on Y.

III. A generalization of the empirical method of Lwin & Maritz (denoted E
in Brown (1982)). The extension is straightforward. Like (L) it uses
the parametric regression of Y on X and derives that of X on Y by means of
the empirical distribution of X. Specifically, if $y_0$ is a new (q x 1)
observation on Y, the prediction for the corresponding X' (p x 1) is

$$ \hat{X}_E = \sum_{i=1}^{n} w_i X_i' \tag{5.5} $$

where $X_i'$ is the ith observation on X and

$$ w_i = f(y_0 \mid X_i) \Big/ \sum_{i=1}^{n} f(y_0 \mid X_i) \tag{5.6} $$

In the case of our analysis, f was assumed to be the multivariate normal
regression density (Mardia et al. (1979), ch. 6) with parameters fixed at
their least squares values. It is, of course, also possible to obtain the
estimator $\hat{X}_E$ for the problem of predicting each component of X separately,
given $y_0$. This estimator is denoted by $\hat{X}_{E'}$, following Brown (1982).

The above 5 predictors were applied to five data sets, constituted as
follows:

(a) the four data sets of Section 3, each augmented by the inclusion of
additional x-variables, viz., surface water temperature and air temperature
[i.e., q = 1, p = 3].

(b) the data set provided in Brown (1982), Section 4, relating four
infrared reflectance responses of wheat (Y) to determination of percent
water, $X_1$, and percent protein, $X_2$, of the wheat [i.e., q = 4, p = 2].

The various predictors were compared using the same criterion as in
Section 3, viz., the mean-squared prediction error where one point at a time
is omitted, and then the x-value for that point is predicted using all the
remaining points for estimation purposes.

6. ANALYSIS OF RESULTS FOR MULTIVARIATE CASE

We consider the results for the Brown data and the Wind/Whitecap data
separately. In Table 3 we present the results for the data of Section 3.

TABLE 3

Mean Squared Prediction Error

                                        Predictor
Data Set   $\hat{X}_L$   $\hat{X}_{L'}$   $\hat{X}_{LB}$   $\hat{X}_E$   $\hat{X}_{E'}$   $\hat{X}_{LB*}$
   1          .095           .192             .059            .059           .056             .0110
   2          .550           .205             .082            .079           .086             .110
   3          .1?            .103             .066            .061           .060             .072
   4          .148           .149             .061            .062           .062             .056




Before comparing these predictors, a number of points should be noted.

(i) $\hat{X}_{L'}$ is simply the classical estimator when only wind and whitecap
variables are taken into account, so that this is identical with the
$\hat{X}_C$ estimator of Section 2.

(ii) By definition, $\hat{X}_{LB}$ predicts each component of X separately and
hence this also is the univariate $\hat{X}_I$ (since Y has only one component here).

(iii) Included in column 6 of Table 3 is the predictor $\hat{X}_{LB*}$, obtained
simply by regressing the wind variable on all other variables in the
analysis [whether X or Y].

A comparison of the columns of Table 3 reveals that none of the
multivariate methods used leads to any noticeable improvement in terms of
predictive capacity over the "best" univariate predictor $\hat{X}_I$ ($\hat{X}_{LB}$). Among the
truly multivariate of these methods [$\hat{X}_L$, $\hat{X}_E$], the empirical $\hat{X}_E$ holds up
extremely well, whereas the classical multivariate $\hat{X}_L$ again is uniformly the
worst.

In Table 4 we present the results for the Brown data:

TABLE 4

Mean Squared Prediction Error

                          Method
Variable      L        L'       E        E'       LB
  $X_1$      .003     .003      ?       .017     .003
  $X_2$      .041     .041     .298     .076      ?




A comparison of the columns of Table 4 confirms the result of Brown (1982)
that the methods L, LB are virtually indistinguishable in terms of
predictive performance for this data set. This is at variance with all
previous univariate results, and with the multivariate conclusions for the
wind/whitecap data. As pointed out in Brown (1982), these results should be
treated with some caution, as the data are perhaps atypical in that such a
large percent of the variation is explained by the model. (Brown predicted
the x-values of 5 points using the remaining 16, and found for methods L, LB
in all cases over 98% of variation explained by the model. A similar
analysis by us for 15 other random samples of size 5 yielded an average of
just under 98% of variation explained.)

Another interesting outcome of this analysis is the relatively poor
performance of the method E for this data set. Our results confirm those of
Brown, and in fact indicate that E is worse than in his analysis.
Incidentally, an examination of the $w_i$ (weights) involved in method E
reveals that when we go to the multivariate case we are dealing with
extremely small numbers (<< exp(-30)) and for this reason the method may be
very susceptible to differences in computational precision in this case.
The method held up well for the wind/whitecap multivariate extension (which
involved the inclusion of additional X's) but has not performed well in this
case with the inclusion of additional Y's. This may be because the
inclusion of additional Y's increases the dimension of the regression
density function, whereas the inclusion of additional X's does not.
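The sensitivity to computational precision mentioned above can be illustrated with a short sketch. It is not part of the original analysis: it assumes the densities are available on the log scale and normalizes the weights after subtracting the largest log density, a standard device (often called the log-sum-exp trick) for avoiding underflow. The function name and the numbers are hypothetical.

```python
import math

def normalized_weights(log_f):
    """Weights w_i = f_i / sum(f_j), computed from log densities."""
    m = max(log_f)
    # Subtracting the maximum makes the largest term exp(0) = 1,
    # so the sum cannot underflow to zero.
    shifted = [math.exp(v - m) for v in log_f]
    total = sum(shifted)
    return [s / total for s in shifted]

# Hypothetical log densities, far below the smallest representable double.
log_f = [-800.0, -802.0, -805.0]

# Naive evaluation: every density underflows to exactly 0.0, so the
# ratio f_i / sum(f_j) would be 0/0.
naive = [math.exp(v) for v in log_f]

w = normalized_weights(log_f)
```

The ratios between the weights are unchanged by the shift, so the stabilized version gives the same weights as exact arithmetic would.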

In fact, in view of the results presented in later sections, a number 
of aspects of the analysis of this data set are not at all surprising. 
Firstly, since the data indicate a very strong underlying correlation, it is 
to be expected that the classical estimator will perform well. Secondly, 
for the same reason, we can expect the Lwin & Maritz type estimator to 
perform poorly. 

7. THE SIMULATION STUDY 

In Sections 3 and 4, we evaluated the performance of a number of point and
interval estimators of wind speed given whitecap coverage when applied to
each of four data sets. The five estimators involved were:

E1   (i)    the inverse
E2   (ii)   the classical
E3   (iii)  estimated empirical Bayes
E4   (iv)   mean of predictive distribution
E5   (v)    mode of predictive distribution

together with corresponding interval estimators, each of which is defined in
Section 4. The general conclusion drawn was that, with the exception of
estimator (ii), which was considerably inferior, all the other estimators
are broadly comparable in terms of predictive performance. This conclusion
is supported by the results of several previous studies.

In this section we further evaluate the performance of these estimators 
by computer simulation. We concentrate in particular on the robustness of 
the estimators, and on the effect of sample size on the predictive ability 
of the estimators. The classical assumption is that both variables in the 
calibration study have normal distributions; this is the first situation we 
have studied. We have subsequently allowed for non-normal distributions for 
each variable in turn, and for both together. Another factor which has 
emerged as being of importance in determining the relative and absolute 
merits of the different estimators is the (true) correlation between the two 
variables, and the effect of this factor has also been examined. 

This section is divided into two parts; the first, Section 8, is
concerned with the point estimators, and the second, Section 9, with
interval estimators.

8. COMPARISON OF POINT ESTIMATORS

8.1 The Point Estimators



The estimators being compared are the five referred to above with
the following additions:

(a) Two alternative versions of estimator E3 [the Empirical Bayes
estimator] are developed, viz.,

E6: assuming the errors follow a Student t-distribution, and estimating its
variance in the standard manner, and

E7: as in E6, except that we use a maximum likelihood estimate of the
variance of the t-distribution.

(b) A further alternative version of estimator E3 is derived by assuming
that the distributions of X and Y|X are $N(\mu_x, \sigma_x^2)$ and $N(\alpha + \beta X, \sigma_y^2)$,
respectively. Then, by straightforward probability calculus we have

$$ f(x \mid y) \sim N\Big\{ \frac{\beta (y - \alpha)\sigma_x^2 + \mu_x \sigma_y^2}{\beta^2 \sigma_x^2 + \sigma_y^2},\ \frac{\sigma_x^2 \sigma_y^2}{\beta^2 \sigma_x^2 + \sigma_y^2} \Big\} \tag{8.1} $$

Hence an "empirical" Bayes estimator of X given Y is

$$ \text{E8:} \qquad \frac{\hat{\sigma}_x^2 \hat{\beta} (y - \hat{\alpha}) + \bar{x} \hat{\sigma}_y^2}{\hat{\beta}^2 \hat{\sigma}_x^2 + \hat{\sigma}_y^2} \tag{8.2} $$

Note that as $\sigma_y^2/\sigma_x^2 \to 0$ (i.e., $\rho^2 \to 1$, where $\rho$ is the (true) correlation
coefficient of X and Y), this estimator tends to $(y - \hat{\alpha})/\hat{\beta}$ ($\hat{\beta} \neq 0$), the classical
estimator. Therefore, certainly for large samples, we would expect the
performance of the classical estimator, which in general is not good, to
improve as $\rho^2 \to 1$. Note further that (see Appendix A) the estimator E8 is
virtually identical with E1 for any reasonably large sample size N, thereby
providing justification for the use of the estimator E1.
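The closeness of E8 and E1 can be checked numerically. The sketch below is an illustration under an assumption not stated in the text: both variances in (8.2) are estimated with divisor n (the maximum likelihood form). With that choice the two estimators coincide algebraically; with the usual divisors n - 1 and n - 2 they differ only by terms of order 1/n. Function names and data are hypothetical.

```python
def e1_inverse(x, y, y0):
    """E1: least-squares regression of X on Y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx + (sxy / syy) * (y0 - my)

def e8_normal_bayes(x, y, y0):
    """E8, equation (8.2), with divisor-n variance estimates (assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope of Y on X
    a = my - b * mx                    # intercept of Y on X
    var_x = sxx / n                    # estimate of sigma_x^2
    var_y = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / n
    return (var_x * b * (y0 - a) + mx * var_y) / (b * b * var_x + var_y)

# Hypothetical data, for illustration only.
x = [0.3, -1.2, 0.8, 1.9, -0.4, 0.1, -2.0, 1.1]
y = [0.9, -0.7, 0.2, 2.5, -1.1, 0.6, -1.4, 0.3]
y0 = 0.5
e1 = e1_inverse(x, y, y0)
e8 = e8_normal_bayes(x, y, y0)
```

The agreement follows because the denominator of (8.2) then equals the sample variance of Y and the numerator reduces to the sample covariance times the centered y-value plus the mean of X times that same variance.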

8.2 Simulation 

The criterion of comparison of the different estimators is their
mean-squared error of prediction. The basic assumption is that we have two
random variables X, Y such that

$$ E\{Y \mid X\} = \alpha + \beta X, \qquad V(Y \mid X) = \sigma_y^2 \tag{8.3} $$

The study involves a number of different assumptions concerning the form of
the distributions of X and Y|X, and these are detailed below. The (true)
values of $\alpha$ and $\beta$ are taken to be 0 and 1, respectively. An initial random
sample of size N is generated from which the predictive relation is derived.
Then 100 further pairs of observations are simulated from the same true
model, and a prediction of the x-value corresponding to each y-value is made
using each of the eight estimators.

The above exercise was carried out 2000 times for every 
combination of the following parameters: 

Sample size N:                20, 40, 80

Squared corr. coefficient:    .1, .5, .9

The overall exercise was repeated for each of the following combinations of
assumptions regarding the forms of the distribution of X and Y:

1.   X: N(0,1)                     Error: Normal, mean 0
2.   X: N(0,1)                     Error: t, 3 d.f.
3.   X: N(0,1)                     Error: Stretched Normal (Gaver (1982))
4.   X: t, 3 d.f., variance 1      Error: Normal, mean 0
5.   X: t, 3 d.f., variance 1      Error: t, 3 d.f.



The error variance in each case was fixed so as to give the required 
correlation between X and Y. 
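A reduced version of the simulation for the first case (X and error both Normal) can be sketched as follows. It is only an illustration: it uses far fewer replications than the 2000 of the study, compares just the inverse (E1) and classical (E2) estimators, and relies on the relation error variance = (1 - rho^2)/rho^2 that follows from alpha = 0, beta = 1 and Var(X) = 1. The function names, seed and replication count are assumptions.

```python
import math
import random

def error_sd(rho_sq, beta=1.0, var_x=1.0):
    """Error s.d. giving squared correlation rho_sq between X and Y."""
    return math.sqrt(beta * beta * var_x * (1.0 - rho_sq) / rho_sq)

def fit(u, v):
    """Least-squares intercept and slope for regressing v on u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    slope = suv / suu
    return mv - slope * mu, slope

def simulate(n, rho_sq, reps=200, n_new=100, seed=1):
    """Return (MSE of inverse, MSE of classical) over all predictions."""
    rng = random.Random(seed)
    sd = error_sd(rho_sq)
    se_inv = se_cls = 0.0
    for _ in range(reps):
        # Calibration sample from the true model Y = X + error.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [xi + rng.gauss(0.0, sd) for xi in x]
        a, b = fit(x, y)      # Y on X, for the classical estimator
        g, d = fit(y, x)      # X on Y, for the inverse estimator
        for _ in range(n_new):
            x0 = rng.gauss(0.0, 1.0)
            y0 = x0 + rng.gauss(0.0, sd)
            se_inv += (g + d * y0 - x0) ** 2
            se_cls += ((y0 - a) / b - x0) ** 2
    total = reps * n_new
    return se_inv / total, se_cls / total

mse_inv, mse_cls = simulate(20, 0.1)
```

Because the classical estimator divides by an estimated slope that can fall close to zero, its simulated MSE is dominated by a few extreme replications and varies greatly from run to run, whereas the inverse estimator's MSE is stable.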

The results arising from each series of assumptions are presented
in Tables 8.1 through 8.5, respectively.

TABLE 8.1

MSE of Various Predictors

X: Normal    Error: Normal

ρ-squared   Estimator     N = 20      N = 40      N = 80
   .1          E1           1.02        0.95        0.92
               E2        1419.50     2063.04      174.14
               E3           1.02        0.95        0.92
               E4           1.01        0.95        0.92
               E5           1.01        0.95        0.92
               E6           1.07        1.00        0.96
               E7           1.01        0.96        0.92
               E8           1.01        0.95        0.92

   .5          E1           0.56        0.52        0.51
               E2           1.51        1.21        1.04
               E3           0.58        0.53        0.51
               E4           0.56        0.52        0.51
               E5           0.56        0.52        0.51
               E6           0.63        0.58        0.56
               E7           0.62        0.56        0.53
               E8           0.56        0.52        0.51

   .9          E1           0.11        0.10        0.10
               E2           0.13        0.12        0.11
               E3           0.15        0.12        0.11
               E4           0.11        0.10        0.10
               E5           0.11        0.10        0.10
               E6           0.18        0.13        0.13
               E7           0.18        0.13        0.13
               E8           0.11        0.10        0.10

TABLE 8.2

MSE of Various Predictors

X: Normal    Error: t, 3 d.f.

ρ-squared   Estimator     N = 20      N = 40      N = 80
   .1          E1           1.04        0.95        0.98
               E2          95.13      141.63       19.67
               E3           0.97        0.92        0.90
               E4           1.00        0.95        0.97
               E5           0.99        0.95        0.95
               E6           0.96        0.88        0.85
               E7           0.93        0.87        0.83
               E8           1.03        0.95        0.98

   .5          E1           0.57        0.49        0.51
               E2           1.32        1.02        1.03
               E3           0.51        0.48        0.48
               E4           0.56        0.49        0.51
               E5           0.55        0.49        0.50
               E6           0.50        0.47        0.44
               E7           0.50        0.46        0.43
               E8           0.56        0.49        0.51

   .9          E1           0.11        0.11        0.12
               E2           0.12        0.12        0.13
               E3           0.14        0.11        0.11
               E4           0.11        0.11        0.12
               E5           0.12        0.11        0.12
               E6           0.16        0.12        0.11
               E7           0.17        0.12        0.11
               E8           0.11        0.11        0.12

TABLE 8.3

MSE of Various Predictors

X: Normal    Error: Stretched Normal

ρ-squared   Estimator     N = 20      N = 40      N = 80
   .1          E1           1.00        0.97        0.92
               E2        3230.46       66.92       14.31
               E3           0.97        0.95        0.91
               E4           0.98        0.97        0.92
               E5           0.99        0.92        0.92
               E6           0.96        0.93        0.88
               E7           0.95        0.92        0.87
               E8           1.00        0.97        0.92

   .5          E1           0.56        0.56        0.52
               E2           1.30        1.19        1.07
               E3           0.52        0.52        0.48
               E4           0.55        0.55        0.52
               E5           0.55        0.56        0.52
               E6           0.52        0.50        0.46
               E7           0.51        0.49        0.46
               E8           0.56        0.56        0.52

   .9          E1           0.11        0.11        0.10
               E2           0.13        0.12        0.11
               E3           0.15        0.11        0.10
               E4           0.11        0.11        0.10
               E5           0.11        0.11        0.10
               E6           0.17        0.12        0.13
               E7           0.17        0.13        0.11
               E8           0.11        0.11        0.10

TABLE 8.4

MSE of Various Predictors

X: t, 3 d.f.    Error: Normal

ρ-squared   Estimator     N = 20      N = 40      N = 80
   .1          E1           0.99        0.95        0.92
               E2        2919.72        0.99        0.97
               E3           0.92        0.93        0.97
               E4           0.98        0.95        0.92
               E5           0.98        0.95        0.93
               E6           0.96        1.00        0.98
               E7           0.92        0.95        0.96
               E8           0.98        0.95        0.92

   .5          E1           0.58        0.55        0.54
               E2           1.82        1.14        1.18
               E3           0.61        0.58        0.56
               E4           0.59        0.55        0.54
               E5           0.59        0.57        0.55
               E6           0.72        0.70        0.68
               E7           0.69        0.69        0.66
               E8           0.59        0.55        0.54

   .9          E1           0.11        0.10        0.10
               E2           0.13        0.11        0.11
               E3           0.21        0.18        0.16
               E4           0.11        0.11        0.10
               E5           0.11        0.12        0.10
               E6           0.35        0.33        0.32
               E7           0.35        0.34        0.33
               E8           0.12        0.11        0.10

TABLE 8.5

MSE of Various Predictors

X: t, 3 d.f.    Error: t, 3 d.f.

ρ-squared   Estimator     N = 20      N = 40      N = 80
   .1          E1           1.09        0.87        0.87
               E2        1652.77        1.00        0.84
               E3           1.06        0.89        0.79
               E4           1.06        0.86        0.78
               E5           1.05        0.86        0.77
               E6           1.05        0.85        0.76
               E7           1.01        0.83        0.73
               E8           1.09        0.88        0.78

   .5          E1           0.56        0.55        0.52
               E2           2.51        1.26        0.97
               E3           0.56        0.55        0.53
               E4           0.57        0.55        0.52
               E5           0.58        0.54        0.53
               E6           0.66        0.61        0.63
               E7           0.65        0.60        0.62
               E8           0.57        0.55        0.52

   .9          E1           0.13        0.11        0.12
               E2           0.13        0.12        0.12
               E3           0.23        0.16        0.17
               E4           0.13        0.11        0.12
               E5           0.13        0.11        0.12
               E6           0.59        0.50        0.47
               E7           0.58        0.50        0.46
               E8           0.13        0.10        0.12

8.3 Discussion of Simulation Results

The criterion used for comparison of estimators (the mean-squared
error) is, of course, scale dependent, and therefore the only meaningful
comparison between estimators is the percentage difference in mean squared
error.

Looking first at Table 8.1 (X and the error both Normal), we see
that E1, E4, E5 and E8 are virtually indistinguishable in terms of
predictive performance. The Lwin & Maritz type procedures (E3, E6, and E7)
are somewhat inferior, particularly for small sample size and/or large $\rho^2$.
For example, for N = 20, $\rho^2$ = .9, the appropriate Lwin & Maritz estimator
(E3) is approximately 36% worse than the four "good" estimators in terms of
mean squared error. The classical estimator (E2) turns out to be just as
bad as might be expected from previous studies, although it does, as we
predicted it should, appear to improve as $\rho^2$ increases.
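As a check on the 36% figure, the percentage difference can be worked out from the Table 8.1 entries for N = 20, $\rho^2$ = .9, namely 0.15 for E3 and 0.11 for E1 (the same value holds for E4, E5 and E8):

```latex
100 \times \frac{\mathrm{MSE}(E3) - \mathrm{MSE}(E1)}{\mathrm{MSE}(E1)}
  = 100 \times \frac{0.15 - 0.11}{0.11}
  \approx 36\%
```

Since the tabulated values are rounded to two decimals, a percentage computed this way is itself only approximate.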

In Table 8.2 (X Normal, error having a t-distribution), the four
estimators E1, E4, E5 and E8 are again essentially identical in their
performance. The classical estimator is again poor, with the same proviso
as above. However, the Lwin & Maritz type estimators (E3, E6, E7) now
perform very well, except for a combination of small N and large $\rho^2$. The
superiority of the most appropriate (and best) of these estimators (E7) is
of the order of a 10 to 15 percent reduction in mean squared error for $\rho^2$ in
the range .1 to .5, although for larger $\rho^2$ and small sample sizes this
difference is smaller, and in one case is reversed (N = 20, $\rho^2$ = .9). This
is a general pattern that has emerged: the Lwin & Maritz type estimators do
not perform well for large $\rho^2$, particularly when the corresponding sample
sizes are small.

In Table 8.3 (X still Normal, the error having even longer, more
straggling tails), the pattern is very similar to that of Table 8.2. Once
again the estimators E1, E4, E5, E8 are broadly comparable. E2 is poor, and
estimators E3, E6 and E7 are, with the type of exception mentioned above,
superior (involving a reduction of up to about 12% in mean squared error).

In the remaining tables, we allow X to be non-Normal. In the case of Table 8.4 (X t-distributed, error Normal), estimators E1, E4, E5, and E8 are still virtually identical. E2 is still the worst, but E3 (which one would expect, given its definition, to be good in this case) is superior only for small ρ², and this superiority is most marked for small N. For moderate ρ² (.5), E3 is marginally worse than E1, E4, E5, and E8, and for large ρ², E3 is distinctly inferior. E6 and E7 are, in general, as might be expected in this case, very poor in their predictive capacity.

Finally, in Table 8.5 we have the case where neither X nor the error is Normally distributed. The pattern of Table 7.6 continues here, except that the cases where E3 is superior are even more limited, and the inferiority of E6 and E7 for large ρ² even more pronounced.

Some general conclusions can now be drawn from the combined results:

1. For all ρ², and all N, regardless of underlying distributions, the estimators E1, E4, E5, and E8 are indistinguishable in terms of predictive performance, when that performance is measured in terms of MSE.

2. When both X and the error are Normal, one of the estimators E1, E4, E5, and E8 should be used. The Lwin & Maritz type estimator can be inferior in this case, particularly for large ρ² and small N.

3. When X is Normal, but the error is not, the LM estimators can be superior, except when there is a combination of high ρ² and small N. Modifying the LM estimator to take account of the form of the error distribution (E6, E7) does lead to further reduction of the mean squared error.

4. When X is long-tailed non-Normal, the range of superiority of the LM estimators is very limited; in fact it only occurs for small ρ², and is most marked for small N. Calibration is probably not a very appropriate technique in that situation. Therefore, when X is non-Normal, one should probably utilize one of the estimators E1, E4, E5 or E8.

5. The estimator E8, which has not been studied before, performs very well in general. The inverse estimator E1 performs equally well, but estimator E8 has some appealing properties, viz.

(i) it can be derived directly from our assumptions (7.3);

(ii) it leads to a simple and reliable confidence interval (cf. Section 8);

(iii) simple algebra (Appendix A) shows that E8 is almost identical with E1, thus providing justification for use of E1.

6. Estimators E4 and E5 also perform well, but are computationally more difficult to obtain, and do not yield easily computable confidence intervals. However, they do give a readily computable predictive distribution.

7. Sample size is not a major factor in the absolute size of the mean squared error. Reading across any of the rows of any of Tables 8.1 through 8.5, we see relatively little reduction in MSE as we go from N = 20 to N = 40 to N = 80. The reduction is certainly small compared to the reduction as we go from ρ² = .1 to ρ² = .5 to ρ² = .9. This is not, of course, very surprising: it merely indicates that the main determining factor in the predictive capacity of the various calibration estimators is the strength of the actual (linear) relationship between the relevant variables. Nevertheless, certain ways of processing the data can have considerable advantages.
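A comparison of the kind summarized in these conclusions can be sketched in a few lines. The sketch below is not the authors' simulation program. It assumes a simple model (X standard Normal, slope 1, Normal error with variance chosen to give the stated ρ²), uses the inverse estimator E1 as defined in the Appendix, and takes the classical estimator E2 in its usual textbook form x̄ + (y0 - ȳ)/β̂. The function names, the seed, and the replication counts are hypothetical.

```python
import random


def fit_sums(xs, ys):
    """Return the means and corrected sums of squares and products."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return xbar, ybar, sxx, syy, sxy


def simulate_mse(rho2, n, reps=200, n_new=50, seed=1):
    """Estimate the MSE of the inverse (E1) and classical (E2) predictors."""
    rng = random.Random(seed)
    # With slope 1 and var(X) = 1, rho2 = 1 / (1 + sigma_e**2).
    sigma_e = ((1.0 - rho2) / rho2) ** 0.5
    se_inv = 0.0
    se_cls = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [x + rng.gauss(0.0, sigma_e) for x in xs]
        xbar, ybar, sxx, syy, sxy = fit_sums(xs, ys)
        for _ in range(n_new):
            x0 = rng.gauss(0.0, 1.0)
            y0 = x0 + rng.gauss(0.0, sigma_e)
            inv = xbar + (sxy / syy) * (y0 - ybar)  # regress x on y
            cls = xbar + (y0 - ybar) / (sxy / sxx)  # invert y-on-x fit
            se_inv += (inv - x0) ** 2
            se_cls += (cls - x0) ** 2
    total = reps * n_new
    return se_inv / total, se_cls / total


mse_inverse, mse_classical = simulate_mse(rho2=0.5, n=20)
print(mse_inverse, mse_classical)
```

Rerunning `simulate_mse` over a grid of ρ² and N values gives a rough feel for how much more the strength of the relationship matters than the sample size.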



9. COMPARISON OF INTERVAL ESTIMATORS

9.1 The Interval Estimators

Although numerous point estimators have been derived and studied 
in connection with the calibration problem, the study of the interval 
estimation problem has been much less extensive. In this section we 
examine, again by means of simulation, the performance of a number of 
interval estimators. These estimators are as follows. 

1. For the point estimator E1, we use the standard 95% confidence interval for the predicted value of X, given y = y_0, viz.

$$E1 \pm t_{0.025}\,\hat\sigma\left[1 + \frac{1}{N} + \frac{(\bar y - y_0)^2}{S_{yy}}\right]^{1/2}$$

where

$$\hat\sigma^2 = \frac{S_{xx} - \hat\beta^{*2} S_{yy}}{N-2}.$$
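As a minimal sketch of the interval attached to E1, the code below computes the usual prediction interval from the regression of x on y: E1 ± t·σ̂·[1 + 1/N + (ȳ - y0)²/S_yy]^(1/2), with σ̂² = (S_xx - β*²S_yy)/(N - 2). The function name, the data, and the way the t quantile is supplied (as an argument rather than computed) are assumptions made for illustration.

```python
import math


def inverse_interval(xs, ys, y0, t_crit):
    """Return (lower, e1, upper) for the inverse-regression prediction of x.

    t_crit is the upper 2.5% point of Student's t with len(xs) - 2
    degrees of freedom; it is passed in because the standard library
    has no t quantile function.
    """
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

    beta_star = sxy / syy                  # slope of x regressed on y
    alpha_star = xbar - beta_star * ybar
    e1 = alpha_star + beta_star * y0

    resid_var = (sxx - beta_star ** 2 * syy) / (n - 2)
    half = t_crit * math.sqrt(
        resid_var * (1.0 + 1.0 / n + (ybar - y0) ** 2 / syy)
    )
    return e1 - half, e1, e1 + half


# Hypothetical calibration data; 2.776 is t(0.025) on 4 degrees of freedom.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]
lower, e1, upper = inverse_interval(xs, ys, y0=3.5, t_crit=2.776)
print(lower, e1, upper)
```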



2. Brownlee (1965) has suggested a 95% confidence interval related to the approach of point estimator E2. This interval, referred to in the following tables as the classical interval, has the disadvantage that it fails to exist in certain circumstances. Its performance is examined.

3. An empirical Bayes-type confidence interval based on the derivation of 
E8 is given by 



E8 



± t 



0.025 




and described herein as empirical Bayes type 1. 

4. Lwin & Maritz have suggested an alternative procedure for deriving an empirical Bayes confidence interval. Three different intervals of this type are calculated, viz.,

(i) an interval based on assuming a normal distribution for the error term, and denoted by type 2;

(ii) an interval based on assuming a t-distribution for the error, and estimating its variance by the standard method: a type 3 empirical Bayes interval;

(iii) similar to (ii), except that a maximum likelihood approach is used to estimate the variance. This we call a type 4 interval.

All these intervals have the property (see Lwin & Maritz (1980)) that they can be semi-infinite. As such intervals make the calculation of average interval length impossible, and as their occurrence is rare, we omit them from our calculations, and merely record their frequency of occurrence.

5. It is possible to construct confidence intervals based on the predictive distribution of Aitchison & Dunsmore, but since this involves, for a single y-value, repeated numerical integration, it is omitted from the simulation study.

9.2 Design of the Simulation Study

The design of the simulation study is identical with that in Section 7, except that, due to the omission of the (computationally lengthy) Aitchison & Dunsmore estimators E4 and E5, we are able to greatly expand the number of replications at each setting of ρ², N. In fact we now repeat the experiment (of generating a sample, and 100 additional pairs of observations for prediction based on the sample) 2000 instead of 100 times. This means that for each ρ², N configuration, we are constructing 200,000 confidence intervals. The intervals so constructed are compared in terms of percent coverage and average length.
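The two comparison criteria, together with the bookkeeping for intervals that are not finite, can be sketched as follows. This is an illustrative summary routine, not the code used in the study. It assumes that semi-infinite intervals are excluded from both the coverage and the length calculations and are only counted; the function name and the toy intervals are hypothetical.

```python
import math


def summarize(intervals, truths):
    """Return (% coverage, average length, % finite) for a set of intervals."""
    finite = [
        (lo, hi, x)
        for (lo, hi), x in zip(intervals, truths)
        if math.isfinite(lo) and math.isfinite(hi)
    ]
    covered = sum(1 for lo, hi, x in finite if lo <= x <= hi)
    pct_cov = 100.0 * covered / len(finite)
    avg_len = sum(hi - lo for lo, hi, _ in finite) / len(finite)
    pct_exist = 100.0 * len(finite) / len(intervals)
    return pct_cov, avg_len, pct_exist


# 2000 replications, each with 100 prediction pairs.
n_intervals = 2000 * 100

# Toy input: four intervals, one of them semi-infinite.
intervals = [(0.0, 2.0), (1.0, 3.0), (-math.inf, 1.0), (2.0, 5.0)]
truths = [1.0, 3.5, 0.0, 4.0]
pct_cov, avg_len, pct_exist = summarize(intervals, truths)
print(n_intervals, pct_cov, avg_len, pct_exist)
```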

The results are presented in Tables 9.1 through 9.5; these tables 
have a direct correspondence with Tables 8.1 through 8.5. 

TABLE 9.1

Performance of Various Confidence Intervals
X: Normal    Error: Normal

                   Inverse       Emp. Bayes 1          Classical
  ρ²   Sample       %     Av.       %     Av.       %     Av.       %
         Size     Cov.  Length    Cov.  Length    Cov.  Length   exist

  .1       20     94.8    4.0     93.1    3.7     94.6   35.6     66.6 (22.7)
           40     94.9    3.8     94.1    3.7     96.6   37.0     83.2 (12.1)
           80     94.9    3.7     94.5    3.7     97.3   32.4     96.0 (2.9)
  .5       20     94.8    3.0     93.6    2.8     96.1    6.2     99.6 (0.4)
           40     95.1    2.9     94.4    2.8     95.7    4.4    100.0 (0)
           80     94.7    2.8     94.3    2.7     95.0    4.1    100.0 (0)
  .9       20     94.9    1.3     93.5    1.3     95.0    1.4    100.0 (0)
           40     95.0    1.3     94.4    1.2     95.1    1.3    100.0 (0)
           80     94.9    1.2     94.6    1.2     94.9    1.3    100.0

TABLE 9.1 (continued)

                     Emp. Bayes 2           Emp. Bayes 3           Emp. Bayes 4
  ρ²   Sample       %     Av.      %        %     Av.      %        %     Av.      %
         Size     Cov.  Length  Exist    Cov.  Length  Exist    Cov.  Length  Exist

  .1       20     86.9    3.2  100.0     88.3    3.5  100.0     88.5    3.5  100.0
           40     90.6    3.5  100.0     92.2    3.7  100.0     92.4    3.8  100.0
           80     93.5    3.7  100.0     94.1    3.8  100.0     94.2    3.9  100.0
  .5       20     82.7    2.4  100.0     78.2    2.2  100.0     87.2    2.5  100.0
           40     90.2    2.7  100.0     86.4    2.5  100.0     89.7    2.7  100.0
           80     92.6    2.7  100.0     89.0    2.5  100.0     92.3    2.8  100.0
  .9       20     78.8    1.1   99.0     42.6    0.6   98.6     50.2    0.7   98.9
           40     86.3    1.1   99.8     45.5    0.5   99.4     55.5    0.6   99.6
           80     90.5    1.2   99.8     46.6    0.5   99.6     59.2    0.6   99.8

TABLE 9.2

Performance of Various Confidence Intervals
X: Normal    Error: t (3 d.f.)

                   Inverse       Emp. Bayes 1          Classical
  ρ²   Sample       %     Av.       %     Av.       %     Av.       %
         Size     Cov.  Length    Cov.  Length    Cov.  Length   exist

  .1       20     94.5    4.0     92.4    3.7     93.9   39.1     69.7 (15.1)
           40     94.8    3.8     93.8    3.6     95.4   46.8          (5.9)
           80     94.9    3.7     94.3    3.6     95.9   26.1     95.5 (1.6)
  .5       20     93.8    2.8     92.2    2.6     94.1    7.3      1.3 (1.5)
           40     95.1    2.8     94.1    2.7     94.7    4.6     99.7
           80     95.5    2.8     95.1    2.8     95.2    4.1     99.9
  .9       20     93.4    1.2     92.3    1.1     93.3    1.3     99.9
           40     94.6    1.2     94.1    1.2     94.5    1.3    100.0
           80     95.2    1.2     95.0    1.2     95.1    1.3    100.0

TABLE 9.2 (continued)

                     Emp. Bayes 2           Emp. Bayes 3           Emp. Bayes 4
  ρ²   Sample       %     Av.      %        %     Av.      %        %     Av.      %
         Size     Cov.  Length  Exist    Cov.  Length  Exist    Cov.  Length  Exist

  .1       20     85.9    3.1  100.0     87.2    3.2  100.0     87.5    3.3  100.0
           40     91.9    3.5  100.0     93.5    3.8  100.0     93.7    3.9  100.0
           80     93.9    3.6  100.0     94.5    3.8  100.0     94.6    3.8  100.0
  .5       20     82.2    2.2   99.8     75.0    1.9  100.0     77.9    2.1  100.0
           40     88.3    2.4   99.9     84.2    2.1  100.0     86.3    2.2  100.0
           80     91.7    2.5   99.9     45.3    0.6   98.9     48.1    0.6   98.9
  .9       20     75.9    1.0   98.6     45.3    0.6   98.9     48.1    0.6   98.9
           40     85.9    1.1   99.5     50.5    0.5   99.3     54.3    0.5   99.4
           80     90.5    1.1   99.9     51.5    0.4   99.7     55.7    0.4   99.7

TABLE 9.3

Performance of Various Confidence Intervals
X: Normal    Error: Stretched Normal

                   Inverse       Emp. Bayes 1          Classical
  ρ²   Sample       %     Av.       %     Av.       %     Av.       %
         Size     Cov.  Length    Cov.  Length    Cov.  Length   exist

  .1       20     94.5    4.0     92.6    3.7     94.2   58.7     67.5 (17.6)
           40     94.7    3.8     93.8    3.7     95.3   36.7     18.6 (8.5)
           80     94.9    3.7     94.4    3.6     96.0   26.7     95.2 (2.0)
  .5       20     93.8    2.9     92.2    2.7     94.1    6.3     99.1 (0.5)
           40     94.4    2.7     93.4    2.7     94.1    4.3     99.8
           80     94.9    2.7     94.6    2.7     94.8    4.1    100.0
  .9       20     93.0    1.2     91.9    1.1     93.0    1.3    100.0
           40     93.9    1.2     93.5    1.2     93.8    1.3    100.0
           80     94.6    1.2     94.3    1.2     94.5    1.3    100.0

TABLE 9.3 (continued)

                     Emp. Bayes 2           Emp. Bayes 3           Emp. Bayes 4
  ρ²   Sample       %     Av.      %        %     Av.      %        %     Av.      %
         Size     Cov.  Length  Exist    Cov.  Length  Exist    Cov.  Length  Exist

  .1       20     85.5    3.1  100.0     87.5    3.3  100.0     87.7    3.4  100.0
           40     90.7    3.4  100.0     92.6    3.7  100.0     93.8    3.7  100.0
           80     93.3    3.6  100.0     94.3    3.9  100.0     94.3    3.9  100.0
  .5       20     82.3    2.3   99.9     76.9    2.1   99.9     79.4    2.2  100.0
           40     88.0    2.4  100.0     83.4    2.1  100.0     90.7    2.5  100.0
           80     91.9    2.6  100.0     89.1    2.3  100.0     90.7    2.5  100.0
  .9       20     77.3    1.1   98.9     48.1    0.6   99.8     50.8    0.6   98.9
           40     86.5    1.2   99.5     51.4    0.5   99.2     55.9    0.6   99.4
           80     90.0    1.2   99.9     53.9    0.4   99.8     58.9    0.6   99.8

TABLE 9.4

Performance of Various Confidence Intervals
X: t    Error: Normal

                   Inverse       Emp. Bayes 1          Classical
  ρ²   Sample       %     Av.       %     Av.       %     Av.       %
         Size     Cov.  Length    Cov.  Length    Cov.  Length   exist

  .1       20     93.5    3.7     92.1    3.5     94.1   53.1     63.2 (15.9)
           40     94.1    3.6     93.5    3.5     96.2   40.7     79.5 (13.3)
           80     94.5    3.6     94.2    3.5     97.1   43.0     92.5 (5.2)
  .5       20     94.1    2.8     92.4    2.6     96.2   12.1     97.7 (2.5)
           40     96.2    2.7     93.4    2.6     95.9    5.3    100.0 (0.0)
           80     94.8    2.7     94.4    2.6     95.3    4.2    100.0
  .9       20     94.4    1.3     92.9    1.2     94.9    1.4    100.0 (0.0)
           40     94.9    1.2     94.0    1.2     95.2    1.4    100.0 (0.0)
           80     94.7    1.2     94.3    1.2     95.0    1.3    100.0 (0.0)

TABLE 9.4 (continued)

                     Emp. Bayes 2           Emp. Bayes 3           Emp. Bayes 4
  ρ²   Sample       %     Av.      %        %     Av.      %        %     Av.      %
         Size     Cov.  Length  Exist    Cov.  Length  Exist    Cov.  Length  Exist

  .1       20     88.5    3.3  100.0     90.3    3.7  100.0     90.7    3.8  100.0
           40     91.4    3.3  100.0     92.5    3.7  100.0     92.6    3.9  100.0
           80     93.6    3.5  100.0     94.3    3.6  100.0     94.2    3.6  100.0
  .5       20     83.9    2.2   99.8     78.4    2.0  100.0     81.4    2.2  100.0
           40     90.4    2.4   99.9     86.9    2.2  100.0     89.5    2.4  100.0
           80     92.0    2.5   99.9     88.6    2.3  100.0     91.4    2.5  100.0
  .9       20     78.8    1.0   98.4     43.8    0.6   99.0     51.4    0.6   99.2
           40     86.0    1.1   99.0     45.3    0.6   99.4     56.2    0.7   99.5
           80     90.7    1.2   99.6     47.1    0.5   99.8     60.6    0.7   99.2


TABLE 9.5

Performance of Various Confidence Intervals
X: t    Error: t

                   Inverse       Emp. Bayes 1          Classical
  ρ²   Sample       %     Av.       %     Av.       %     Av.       %
         Size     Cov.  Length    Cov.  Length    Cov.  Length   exist

  .1       20     93.2    3.6     91.5    3.3     94.3   40.1     66.2 (16.1)
           40     94.0    3.5     93.1    3.4     95.4   41.0     79.0 (9.1)
           80     94.4    3.5     94.0    3.5     95.8   27.2     93.5 (2.5)
  .5       20     92.7    2.6     90.9    2.4     94.0    7.0     96.6 (1.2)
           40     94.0    2.6     92.7    2.5     94.5    4.5     99.2 (0.1)
           80     94.0    2.6     93.6    2.5     94.7    4.0     99.7 (0.0)
  .9       20     93.1    1.2     92.7    1.1     93.5    1.3     99.9 (0.0)
           40     94.0    1.1     93.2    1.1     94.1    1.3     99.9 (0.0)
           80     94.4    1.1     94.1    1.1     94.6    1.3    100.0 (0.0)

TABLE 9.5 (continued)

                     Emp. Bayes 2           Emp. Bayes 3           Emp. Bayes 4
  ρ²   Sample       %     Av.      %        %     Av.      %        %     Av.      %
         Size     Cov.  Length  Exist    Cov.  Length  Exist    Cov.  Length  Exist

  .1       20     86.7    3.0   99.9     88.4    3.4  100.0     88.7    3.4  100.0
           40     92.3    3.5   99.9     93.6    3.8  100.0     93.7    3.9  100.0
           80     93.6    3.5   99.9     94.2    3.6  100.0     94.2    3.6  100.0
  .5       20     83.1    2.1   99.5     75.8    1.8  100.0     79.1    1.9  100.0
           40     89.3    2.2   99.8     84.9    2.0  100.0     86.9    2.0  100.0
           80     91.6    2.3   99.8     88.6    2.0  100.0     90.2    2.1  100.0
  .9       20     76.8    0.9   98.3     44.2    0.6   99.1     47.8    0.6   99.2
           40     86.9    1.1   99.0     51.7    0.6   99.5     55.3    0.6   99.6
           80     90.6    1.1   99.3     55.5    0.5   99.7     59.2    0.5   99.7

9.3 Discussion of the Results



We discuss each of the interval types separately. 

9.3.1 The Inverse

This interval performs extremely well, both in terms of % coverage and average interval length. Of the intervals studied it has uniformly the shortest average length for a given level of coverage. Its robustness in terms of coverage is very good. The actual coverage does not fall below 93% in any of the five distributional situations considered, and for sample sizes 40 and 80 it does not fall below 94%. For situations where X is Normal, the coverage is very close to 95%.

9.3.2 The Empirical Bayes type 1

In terms of % coverage and average length, this empirical Bayes 
interval has a performance profile very similar to that of the inverse. Its 
coverage, in general, tends to be somewhat lower than the required 95%, 
and the average length tends to be marginally less than that for the 
inverse. Its robustness is very similar to that described in relation to 
the inverse. 

9.3.3 The Classical

This interval performs, in general, very poorly. In the first instance, we examine the situations where it fails to exist. The final column in each table gives the percentage of simulations for which this interval existed. In general, no real interval existed for a large percentage of the simulations when ρ² was small (.1), and particularly so if N was also small. The percentage of non-existing intervals decreased rapidly (from c. 30-35% to 4-5%) as N increased from 20 to 80. A further difficulty also emerged, even in cases where the interval existed: in some such cases, the lower end-point given by Brownlee was larger than the upper end-point. The percentage of such cases is given in parentheses with the % existence figures in each table. Once again, the problem arises predominantly in a small ρ², small N situation. To overcome this difficulty, we interchanged the end-points when this situation arose. Having made this adjustment, the interval does indeed give a % coverage close to 95%. However, in terms of interval length, it performs very poorly relative to the other intervals, with one exception. As ρ² becomes large (ρ² = .9), the average length tends to that of the other intervals. An explanation of this behavior is provided by the fact that as ρ² becomes large (the ratio of the error variance to the variance of X tends to 0 in our model) the center point of the Brownlee interval, viz.



g(y - g) 



which in general (if we consider the average lengths of the 95% confidence interval) is not a very good estimator, tends to



6 

i.e., the classical estimator, E2. We have already seen that there is reason to expect this estimator to be good for large ρ².

9.3.4 The Empirical Bayes (types 2-4)

In general, these intervals do not perform well, particularly in terms of coverage. The coverage is close to 95% only for the case of small ρ² combined with large N. Otherwise the coverage is less than 95%, and in some cases (particularly for large ρ²) very much less than 95%. As we have previously noted in the simulation study of point estimators, the corresponding point estimators also perform very poorly for the same range of parameter values. Intervals of this type are not to be recommended.

9.3.5 General Conclusions 

Of the six different intervals studied, that associated with the 
inverse point estimator is uniformly the best. It is by far the most robust 
to departures from underlying assumptions, and is strongly recommended for 
use in construction of confidence intervals for the calibration problem. 
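
Appendix

Given y = y_0, if we define E1 to be

$$E1 = \alpha^* + \beta^* y_0, \qquad \text{(A.1)}$$

where

$$\beta^* = \frac{S_{xy}}{S_{yy}}, \quad \text{and} \quad \alpha^* = \bar x - \beta^* \bar y,$$

using conventional notation, and E8 to be

$$E8 = \frac{\hat\sigma_x^2\,\hat\beta\,(y_0 - \hat\alpha) + \bar x\,\hat\sigma_y^2}{\hat\beta^2 \hat\sigma_x^2 + \hat\sigma_y^2}, \qquad \text{(A.2)}$$

where

$$\hat\beta = \frac{S_{xy}}{S_{xx}}, \quad \hat\sigma_x^2 = \frac{S_{xx}}{n-1}, \quad \hat\alpha = \bar y - \hat\beta \bar x, \quad \text{and} \quad \hat\sigma_y^2 = \frac{S_{yy} - \hat\beta^2 S_{xx}}{n-2},$$

then

$$E8 = \frac{\dfrac{S_{xx}}{n-1}\,\dfrac{S_{xy}}{S_{xx}}\left[y_0 - \left(\bar y - \dfrac{S_{xy}}{S_{xx}}\,\bar x\right)\right] + \bar x\,\dfrac{S_{yy} - \hat\beta^2 S_{xx}}{n-2}}{\hat\beta^2\,\dfrac{S_{xx}}{n-1} + \dfrac{S_{yy} - \hat\beta^2 S_{xx}}{n-2}},$$

and if n is reasonably large, so that n-1 ≈ n-2, then

$$
\begin{aligned}
E8 &\approx \frac{S_{xy}\left(y_0 - \bar y + \dfrac{S_{xy}}{S_{xx}}\,\bar x\right) + \bar x\left(S_{yy} - \hat\beta^2 S_{xx}\right)}{\hat\beta^2 S_{xx} + S_{yy} - \hat\beta^2 S_{xx}} \\
&= \frac{S_{xy}}{S_{yy}}\,y_0 - \frac{S_{xy}}{S_{yy}}\,\bar y + \frac{S_{xy}^2}{S_{xx} S_{yy}}\,\bar x + \bar x - \frac{S_{xy}^2}{S_{xx} S_{yy}}\,\bar x \\
&= \left(\bar x - \frac{S_{xy}}{S_{yy}}\,\bar y\right) + \frac{S_{xy}}{S_{yy}}\,y_0 \\
&= \alpha^* + \beta^* y_0 = E1.
\end{aligned}
$$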
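The near-identity of E8 and E1 can also be checked numerically. The sketch below assumes the definitions E1 = α* + β*y0 with β* = S_xy/S_yy, and E8 = [σ̂x²β̂(y0 - α̂) + x̄σ̂y²]/(β̂²σ̂x² + σ̂y²) with σ̂x² = S_xx/(n-1) and σ̂y² = (S_yy - β̂²S_xx)/(n-2). The data are simulated under a hypothetical linear model, and the function name `e1_e8` is illustrative only.

```python
import random


def e1_e8(xs, ys, y0):
    """Return the inverse estimator E1 and the empirical Bayes estimator E8."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

    # E1: regression of x on y, evaluated at y0.
    beta_star = sxy / syy
    alpha_star = xbar - beta_star * ybar
    e1 = alpha_star + beta_star * y0

    # E8: built from the regression of y on x and the two variance estimates.
    beta_hat = sxy / sxx
    alpha_hat = ybar - beta_hat * xbar
    var_x = sxx / (n - 1)
    var_y = (syy - beta_hat ** 2 * sxx) / (n - 2)
    e8 = (var_x * beta_hat * (y0 - alpha_hat) + xbar * var_y) / (
        beta_hat ** 2 * var_x + var_y
    )
    return e1, e8


# Hypothetical calibration sample of size 200.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [2.0 + 0.8 * x + rng.gauss(0.0, 0.5) for x in xs]
e1, e8 = e1_e8(xs, ys, y0=3.0)
print(e1, e8, abs(e1 - e8))
```

Working through the algebra without the n-1 ≈ n-2 approximation shows that the two estimators differ by at most |E1 - x̄|/(n-2), so the gap shrinks as the calibration sample grows.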
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